Supplementary Materials for “Distributed Newton Methods for Deep Neural Networks”
Abstract
Notation     Description
y^i          The label vector of the ith training instance.
x^i          The feature vector of the ith training instance.
l            The number of training instances.
K            The number of classes.
θ            The model vector (weights and biases) of the neural network.
ξ            The loss function.
ξ_i          The training loss of the ith instance.
f            The objective function.
C            The regularization parameter.
L            The number of layers of the neural network.
n_m          The number of neurons in the mth layer.
n_0          The number of input neurons (the dimension of the feature vector).
n_L          The number of output neurons (the number of classes; for binary classification one may instead use n_L = 1).
W^m          The weight matrix in the mth layer, with dimension n_{m−1} × n_m.
w^m_{tj}     The weight between neuron t in the (m−1)th layer and neuron j in the mth layer.
w^m          The vector obtained by concatenating the columns of W^m.
b^m          The bias vector in the mth layer.
s^{m,i}      The affine function (W^m)^T z^{m−1,i} + b^m in the mth layer for the ith instance.
z^{m,i}      The output vector (element-wise application of the activation function to s^{m,i}) in the mth layer for the ith instance.
σ            The activation function.
n            The total number of weights and biases.
J^i          The Jacobian matrix of z^{L,i} with respect to θ.
J^i_p        The local component of J^i in partition p.
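To make the table concrete, here is a minimal sketch (in NumPy) of the forward pass it defines: z^{0,i} = x^i, then s^{m,i} = (W^m)^T z^{m−1,i} + b^m and z^{m,i} = σ(s^{m,i}) for m = 1, ..., L. The sigmoid activation, the layer sizes, and the random initialization are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def sigma(s):
    """Activation function; the sigmoid is an illustrative choice."""
    return 1.0 / (1.0 + np.exp(-s))

def forward(x, weights, biases):
    """Forward pass in the table's notation: z^{0,i} = x^i, then for
    m = 1..L, s^{m,i} = (W^m)^T z^{m-1,i} + b^m and z^{m,i} = sigma(s^{m,i})."""
    z = x
    for W, b in zip(weights, biases):
        s = W.T @ z + b      # W^m has shape (n_{m-1}, n_m), so s has length n_m
        z = sigma(s)
    return z                 # z^{L,i}, the network output for the ith instance

# Illustrative sizes: n_0 = 4 input neurons, one hidden layer, n_L = 3 classes.
rng = np.random.default_rng(0)
sizes = [4, 5, 3]
weights = [rng.standard_normal((sizes[m], sizes[m + 1])) for m in range(len(sizes) - 1)]
biases = [np.zeros(sizes[m + 1]) for m in range(len(sizes) - 1)]
print(forward(rng.standard_normal(4), weights, biases))
```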
Similar resources
Distributed Newton Methods for Deep Neural Networks
Deep learning involves a difficult non-convex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but the calculation of the function, gradient, and Hessian is expensive; in particular, communication and synchronization costs may become a bottleneck. In this...
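Newton-type methods for deep networks typically replace the Hessian with a Gauss-Newton matrix built from the per-instance Jacobians J^i and apply it only through matrix-vector products inside a conjugate-gradient solve. The sketch below is a toy illustration of where the distributed cost arises: each partition contributes a local term (cf. J^i_p in the notation table), and summing those terms across machines is the communication step. The least-squares loss, the partition count, and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting: least-squares loss, so the per-instance Jacobian J^i is a row
# of A and the Gauss-Newton matrix is G = I/C + (1/l) * sum_i (J^i)^T J^i.
l, n, C = 8, 5, 1.0
A = rng.standard_normal((l, n))                 # stacked Jacobians J^i
partitions = np.array_split(np.arange(l), 2)    # instances split over 2 nodes

def gauss_newton_product(v):
    """G v assembled partition by partition. In a real distributed run each
    local term (J_p)^T J_p v lives on its own node, and the final sum is an
    all-reduce -- the communication/synchronization cost the abstract notes."""
    local = [A[idx].T @ (A[idx] @ v) for idx in partitions]
    return v / C + sum(local) / l

v = rng.standard_normal(n)
print(gauss_newton_product(v))                  # the key product in each CG step
```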
On the convergence speed of artificial neural networks in the solving of linear systems
Artificial neural networks offer advantages such as learning, adaptation, fault tolerance, parallelism, and generalization. This paper examines how diverse learning methods affect the speed of convergence of neural networks. To this end, we first introduce a perceptron method based on artificial neural networks, which has been applied to solving a non-singula...
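As a rough sketch of such an iterative "neural" solver, gradient descent on the energy E(x) = ||Ax − b||^2 / 2 converges to the solution of Ax = b for a non-singular A, with the learning rate controlling the convergence speed; the matrix, step size, and stopping rule below are illustrative, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # non-singular by construction
b = rng.standard_normal(4)

x = np.zeros(4)
eta = 0.01                            # learning rate; it governs convergence speed
for step in range(100000):
    grad = A.T @ (A @ x - b)          # gradient of E(x) = ||Ax - b||^2 / 2
    x -= eta * grad
    if np.linalg.norm(grad) < 1e-10:
        break

print(step, np.linalg.norm(A @ x - b))   # residual is ~0 at the fixed point
```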
A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
Reconstructing information contaminated by cloud and cloud shadow is an important pre-processing step for high-resolution satellite images, and automatic segmentation of cloud and cloud shadow can be the first step of that reconstruction. This stage is a considerable challenge due to the relatively inefficient performanc...
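A minimal sketch of what "multi-scale" can mean here, assuming a simple two-branch design in PyTorch: features are extracted at full and half resolution, the coarse branch is upsampled, and the two are fused into per-pixel class scores. The channel counts, the three-class head (cloud / shadow / clear), and the 4-band input are hypothetical, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleBlock(nn.Module):
    """Two-branch multi-scale block: full- and half-resolution features are
    fused into per-pixel class logits. All sizes are illustrative."""
    def __init__(self, in_ch=4, mid_ch=16, n_classes=3):
        super().__init__()
        self.fine = nn.Conv2d(in_ch, mid_ch, 3, padding=1)
        self.coarse = nn.Conv2d(in_ch, mid_ch, 3, padding=1)
        self.head = nn.Conv2d(2 * mid_ch, n_classes, 1)   # per-pixel labels

    def forward(self, x):
        f = F.relu(self.fine(x))
        c = F.relu(self.coarse(F.avg_pool2d(x, 2)))       # half-resolution view
        c = F.interpolate(c, size=f.shape[-2:], mode="bilinear", align_corners=False)
        return self.head(torch.cat([f, c], dim=1))        # fuse the two scales

# Toy 4-band Gaofen-like patch: batch of 1, 4 channels, 64x64 pixels.
logits = MultiScaleBlock()(torch.randn(1, 4, 64, 64))
print(logits.shape)  # torch.Size([1, 3, 64, 64])
```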
A Hybrid Neural Network Approach for Kinematic Modeling of a Novel 6-UPS Parallel Human-Like Mastication Robot
Introduction: We aimed to introduce a 6-universal-prismatic-spherical (UPS) parallel mechanism for human jaw motion and to theoretically evaluate its kinematic problem. We proposed a strategy that provides a fast and accurate solution to the kinematic problem; it accelerates the solution of the direct kinematic problem by reducing the number of required ...
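A minimal sketch of the neural part of such a hybrid strategy, assuming the network learns the direct kinematic map from the six prismatic leg lengths to the platform pose; the stand-in function fk, the layer sizes, and the training loop below are hypothetical, since the real 6-UPS forward kinematics has no closed form and is normally obtained by numerical iteration.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((6, 6))

def fk(q):
    """Hypothetical smooth stand-in for the 6-UPS direct kinematics
    (6 prismatic leg lengths -> 6-DOF platform pose)."""
    return np.tanh(q @ M)

Q = rng.uniform(-1, 1, (2000, 6))    # sampled actuator configurations
P = fk(Q)                            # corresponding poses (training targets)

# One-hidden-layer network fitted by plain gradient descent on the MSE.
W1 = 0.3 * rng.standard_normal((6, 64))
W2 = 0.3 * rng.standard_normal((64, 6))
for _ in range(2000):
    H = np.tanh(Q @ W1)                                    # hidden features
    E = H @ W2 - P                                         # prediction error
    W2 -= 0.1 * H.T @ E / len(Q)
    W1 -= 0.1 * Q.T @ ((E @ W2.T) * (1 - H ** 2)) / len(Q)

print(np.mean((np.tanh(Q @ W1) @ W2 - P) ** 2))            # training MSE
```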
Cystoscopy Image Classification Using Deep Convolutional Neural Networks
In the past three decades, the use of smart methods in medical diagnostic systems has attracted the attention of many researchers. However, no smart approach has been provided in the field of medical image processing for the diagnosis of bladder cancer through cystoscopy images, despite its high prevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...
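A minimal sketch of the usual transfer-learning recipe for such a task, assuming a pretrained backbone with a new binary head; the choice of ResNet-18, the frozen features, and the dummy batch are illustrative assumptions, and the paper's actual networks and protocol are in its full text.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pretrained backbone and freeze its features; only the new
# 2-class cystoscopy head (normal vs. tumor, hypothetical labels) is trained.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch of 224x224 RGB frames.
images, labels = torch.randn(4, 3, 224, 224), torch.tensor([0, 1, 1, 0])
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(float(loss))
```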
Publication date: 2018